Coordinated point-in-time snapshots of multiple computing platforms

ABSTRACT

Coordinating point-in-time snapshots among multiple computing platforms by receiving a notification from a first computing platform agent indicating a first computing platform snapshot time, receiving a notification from a second computing platform agent indicating a second computing platform snapshot time, determining that the second computing platform snapshot time is later than the first computing platform snapshot time, notifying the first computing platform agent of the second computing platform snapshot time, and receiving from the first computing platform agent a report of any location in the first computing platform's data storage to which data were written after the first computing platform snapshot time and responsive to a write request that was made prior to or including the second computing platform snapshot time.

FIELD

Embodiments of the invention relate to controlling computer software processes in general, and more particularly to coordinating point-in-time snapshots among multiple computing platforms.

BACKGROUND

Conventional data backup techniques often involve periodically taking point-in-time (PIT) “snapshots” to preserve the state of data stored by computers and virtual machines that are hosted by computers. Typically, a snapshot identifies data storage locations to which data were written since the last data backup was performed or since the last snapshot was taken. Measures may be taken to prevent these data storage locations from being overwritten until a backup of their data is made.

Complex computing systems often involve multiple computer applications being executed on multiple computing platforms, where the applications share data among them. In such systems each computing platform typically manages its own data storage. Conventional data backup techniques as applied to such systems may require each computing platform to take its own snapshot of its own data storage. However, as data are shared between the applications, it may be critical to ensure that snapshots of the various computing platforms be taken at the same point in time in order to maintain data consistency between the applications. Unfortunately, the different computing platforms in such systems often require different amounts of time to create their snapshots.

SUMMARY

In one aspect of the invention a method is provided for coordinating point-in-time snapshots among multiple computing platforms, the method including notifying an agent on a first computing platform of the time when a snapshot of a second computing platform's data storage was performed, and receiving from the agent on the first computing platform a report of any location in the first computing platform's data storage to which data were written after the time when the snapshot of the first computing platform's data storage was performed and in response to a write request that was made prior to or including the time when the snapshot of the second computing platform's data storage was performed.

In another aspect of the invention a method is provided for coordinating point-in-time snapshots among multiple computing platforms, the method including receiving a notification indicating a snapshot synchronization time that is later than a time when a snapshot of a computing platform's data storage was performed, and reporting any location in the computing platform's data storage to which data were written after the time when the snapshot of the computing platform's data storage was performed and in response to a write request that was made prior to or including the snapshot synchronization time.

In other aspects of the invention systems and computer program products embodying the invention are provided.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the invention will be understood and appreciated more fully from the following detailed description taken in conjunction with the appended drawings in which:

FIG. 1 is a simplified conceptual illustration of a system for coordinating point-in-time snapshots of multiple computing platforms, constructed and operative in accordance with an embodiment of the invention;

FIGS. 2A and 2B, taken together, is a simplified action diagram of an exemplary method of operation of the system of FIG. 1, operative in accordance with an embodiment of the invention;

FIG. 3 is a simplified flowchart illustration of an exemplary method of operation of a backup server, operative in accordance with an embodiment of the invention;

FIG. 4 is a simplified flowchart illustration of an exemplary method of operation of an agent, operative in accordance with an embodiment of the invention; and

FIG. 5 is a simplified block diagram illustration of an exemplary hardware implementation of a computing system, constructed and operative in accordance with an embodiment of the invention.

DETAILED DESCRIPTION

Embodiments of the invention are now described, although the description is intended to be illustrative of the invention as a whole, and is not to be construed as limiting the invention to the embodiments shown. It is appreciated that various modifications may occur to those skilled in the art that, while not specifically shown herein, are nevertheless within the true spirit and scope of the invention.

As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.

Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical data storage device, a magnetic data storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.

A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.

Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.

Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).

Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.

The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.

Reference is now made to FIG. 1 which is a simplified conceptual illustration of a system for coordinating point-in-time snapshots of multiple computing platforms, constructed and operative in accordance with an embodiment of the invention. In the system of FIG. 1 a backup server 100 is shown configured to communicate with computing platforms 102, 104, and 106, although it is appreciated that backup server 100 may be configured to communicate with any number of computing platforms. A computing platform as referred to herein may, for example, refer to a computer together with its operating system, a virtual machine that is hosted by a computer, or a hypervisor that is hosted by a computer and that manages multiple virtual machines that are themselves hosted by one or more computers, although the person of ordinary skill in the art to which the embodiments of the invention pertain will recognize that other combinations of computer hardware and software are contemplated in the context of the embodiments of the invention. Computing platforms 102, 104, and 106 are each preferably configured with an agent 108, a file system monitor 110, a disk monitor 112, and data storage 114, which may include one or more physical data storage devices or portions thereof.

Additional reference is now made to FIGS. 2A and 2B which, taken together, is a simplified action diagram of an exemplary method of operation of the system of FIG. 1, operative in accordance with an embodiment of the invention. In this example references to agent 108, file system monitor 110, disk monitor 112, and data storage 114 may be understood to refer to any, and preferably each, of their instances on each of computing platforms 102, 104, and 106. In the method of FIGS. 2A and 2B backup server 100 instructs agent 108 to perform a snapshot of data storage 114 at or about a designated time (step 200), such as by instructing agent 108 to perform a backup operation which in turn causes agent 108 to perform the snapshot. Preferably, the current time as known to the operating systems on computing platforms 102, 104, and 106 is the same, such as where their internal clocks are synchronized in accordance with conventional techniques. At or about the designated time agent 108 instructs its computing platform's operating system or a component thereof, such as the Volume Shadow Copy Service™ (VSS) where the operating system is Microsoft Windows™, to prepare to perform a snapshot of data storage 114, such as by performing freeze and flush operations in accordance with conventional techniques prior to performing the snapshot (step 202). After the freeze and flush operations have been completed, agent 108 instructs file system monitor 110 to intercept all requests on its computing platform to write data to files and record the time of each request, while preventing the data from being written to data storage 114 (step 204). Agent 108 then performs the snapshot of data storage 114 in accordance with conventional techniques, such as by instructing its computing platform's operating system or a component thereof, such as VSS, to perform the snapshot (step 206). After the snapshot has been completed, agent 108 instructs its computing platform's operating system or a component thereof, such as VSS, to perform an unfreeze, or thaw, operation in accordance with conventional techniques (step 208). Agent 108 then notifies backup server 100 of the time when the snapshot was performed, which is preferably expressed as the time when the snapshot was completed (step 210).
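
The agent-side sequence of steps 202-210 may be sketched as follows. This is a minimal Python illustration under stated assumptions rather than the implementation described above: `WriteRequest`, `FileSystemMonitor`, `run_snapshot_sequence`, and the injected `clock` are hypothetical names, and the freeze, snapshot, and thaw operations, which would be carried out by the operating system or a component such as VSS, are represented only as log entries.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class WriteRequest:
    path: str                   # hypothetical target file
    data: bytes
    requested_at: float = 0.0   # set by the monitor when intercepted


@dataclass
class FileSystemMonitor:
    """Hypothetical stand-in for file system monitor 110."""
    clock: Callable[[], float]
    intercepting: bool = False
    held: List[WriteRequest] = field(default_factory=list)

    def start_intercepting(self) -> None:
        self.intercepting = True

    def submit(self, request: WriteRequest) -> bool:
        """Return True if the write may proceed, False if it is held."""
        if not self.intercepting:
            return True
        request.requested_at = self.clock()  # record the time of the request
        self.held.append(request)            # keep the data away from storage
        return False


def run_snapshot_sequence(monitor: FileSystemMonitor,
                          clock: Callable[[], float],
                          log: List[str]) -> float:
    """Steps 202-210; returns the snapshot time to report to the server."""
    log.append("freeze+flush")      # step 202 (operating system call assumed)
    monitor.start_intercepting()    # step 204
    log.append("snapshot")          # step 206 (operating system call assumed)
    completed_at = clock()          # snapshot time taken as completion time
    log.append("thaw")              # step 208 (operating system call assumed)
    return completed_at             # step 210: value sent to the backup server
```

In this sketch a write submitted before step 204 passes through, while a write submitted afterwards is timestamped and held, which is the property the later steps rely on.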

Once backup server 100 is notified of the time of each of the snapshots performed on computing platforms 102, 104, and 106, backup server 100 then determines which of the snapshots of computing platforms 102, 104, and 106 was performed last in terms of the time when each snapshot was performed, where this time is now referred to as the snapshot synchronization time (SST), and notifies agent 108 of the SST (step 212). If for any reason agent 108 does not receive the SST from backup server 100, such as within a predefined amount of time after agent 108 notifies backup server 100 of the time when its snapshot was performed, agent 108 preferably instructs file system monitor 110 to release any data write requests that it intercepted, thereby allowing their data to be written to data storage 114 (preferably only to locations to which data were not written as indicated in the snapshot), notifies backup server 100 that it is not participating in the synchronized snapshot as described below, and provides its snapshot information to backup server 100, whereupon steps 214-228 below are skipped. Otherwise, agent 108 instructs disk monitor 112 to intercept all requests on its computing platform to write data to data storage 114 (step 214). Agent 108 then instructs file system monitor 110 to release any data write requests that it intercepted prior to and including the SST (step 216). Disk monitor 112 intercepts the released data write requests (step 218), allows their data to be written to data storage 114 (step 220), preferably only to locations to which data were not written as indicated in the snapshot, records the data storage locations to which the data are written, and reports the locations to agent 108 (step 222). Agent 108 then instructs disk monitor 112 to stop intercepting data write requests (step 224). Agent 108 then instructs file system monitor 110 to release any remaining data write requests that it intercepted and then stop intercepting data write requests (step 226).
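
The two-phase release of steps 214-226 may be illustrated with the following Python sketch. The names `HeldWrite` and `release_held_writes`, the integer block addresses, and the dictionary standing in for data storage 114 are assumptions made for illustration. The sketch omits both the preference for writing only to locations not indicated in the snapshot and the fallback taken when no SST is received.

```python
from typing import Dict, List, NamedTuple


class HeldWrite(NamedTuple):
    requested_at: float   # time recorded at interception
    location: int         # hypothetical block address
    data: bytes


def release_held_writes(held: List[HeldWrite],
                        sst: float,
                        storage: Dict[int, bytes]) -> List[int]:
    """Apply held writes in two phases and return the recorded locations."""
    recorded: List[int] = []
    remaining: List[HeldWrite] = []

    # Steps 214-222: writes requested at or before the SST are written
    # while the disk monitor records where they land.
    for write in held:
        if write.requested_at <= sst:
            storage[write.location] = write.data
            if write.location not in recorded:
                recorded.append(write.location)
        else:
            remaining.append(write)

    # Steps 224-226: recording stops, then the remaining writes are released.
    for write in remaining:
        storage[write.location] = write.data

    return recorded
```

Because the held requests are processed in arrival order and every first-phase request precedes every second-phase request in time, a later write to the same location still overwrites an earlier one.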

Agent 108 provides its snapshot information to backup server 100, as well as the locations to which data were written to data storage 114 after the time when the snapshot of data storage 114 was performed, where the requests to write the data were intercepted by file system monitor 110 prior to and including the SST (step 228). These locations to which data were written after the snapshot was performed, where the requests to write the data were intercepted prior to and including the SST, may be reported by agent 108 separately from the snapshot, or, alternatively, the snapshot itself may be updated to reflect this information.

Backup server 100 may, in accordance with conventional techniques, back up the data storage of any of computing platforms 102, 104, and 106 using their snapshots, together with the information regarding the locations to which data were written after any of the snapshots were performed, where the requests to write the data were intercepted prior to and including the SST.

Reference is now made to FIG. 3, which is a simplified flowchart illustration of an exemplary method of operation of a backup server, such as backup server 100 of the system of FIG. 1, operative in accordance with an embodiment of the invention. In the method of FIG. 3 an agent on a first computing platform and an agent on a second computing platform are instructed to perform snapshots of their computing platform's data storage at or about the same designated time (step 300). A notification is received from the agent on the first computing platform indicating the time when the snapshot of the first computing platform's data storage was performed (step 302). A notification is also received from the agent on the second computing platform indicating the time when the snapshot of the second computing platform's data storage was performed (step 304). The snapshot synchronization time (SST) is determined as the time when the snapshot of the second computing platform's data storage was performed if it is later than the time when the snapshot of the first computing platform's data storage was performed (step 306). The agent on the first computing platform is notified of the SST (step 308). In addition to receiving the snapshots of the first and second computing platforms from the agents, a report is received from the agent on the first computing platform regarding any location in the first computing platform's data storage to which data were written after the time when the snapshot of the first computing platform's data storage was performed and responsive to a write request that was made prior to or including the SST (step 310).
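
The determination of the SST in step 306, generalized to any number of platforms, may be sketched in Python as follows. The function names `determine_sst` and `catch_up_windows` and the string platform identifiers are hypothetical. The sketch assumes that the reported times come from synchronized clocks and are directly comparable.

```python
from typing import Dict, Tuple


def determine_sst(snapshot_times: Dict[str, float]) -> float:
    """Return the latest reported snapshot time (step 306)."""
    if not snapshot_times:
        raise ValueError("no snapshot times reported")
    return max(snapshot_times.values())


def catch_up_windows(
    snapshot_times: Dict[str, float],
) -> Dict[str, Tuple[float, float]]:
    """Map each lagging platform to its (snapshot time, SST) window.

    Writes requested inside this window are the ones whose locations
    the platform's agent is expected to report (step 310).
    """
    sst = determine_sst(snapshot_times)
    return {
        platform: (taken_at, sst)
        for platform, taken_at in snapshot_times.items()
        if taken_at < sst
    }
```

The platform whose snapshot completed last has an empty window and therefore does not appear in the result, although in the system of FIG. 1 every agent is still notified of the SST.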

Reference is now made to FIG. 4, which is a simplified flowchart illustration of an exemplary method of operation of an agent, such as agent 108 of the system of FIG. 1, operative in accordance with an embodiment of the invention. In the method of FIG. 4 a snapshot is performed of a computing platform's data storage (step 400) while ensuring that data write requests are intercepted and prevented from being written to the computing platform's data storage (step 402). The time when the snapshot was performed is reported to a backup server (step 404). A notification is received from the backup server of a snapshot synchronization time (SST) that is later than the time when the snapshot of the computing platform's data storage was performed (step 406). Data of intercepted data write requests that were made prior to or including the SST are allowed to be written to the computing platform's data storage (step 408), and their locations in the computing platform's data storage are reported to the backup server (step 410), either separately from the snapshot or by incorporating the location information into the snapshot.
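
The two reporting options of step 410 may be contrasted with a short Python sketch. It assumes, for illustration only, that a snapshot can be modeled as the set of storage locations it identifies as written. `Snapshot`, `AgentReport`, and `build_report` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class Snapshot:
    taken_at: float
    changed_locations: FrozenSet[int]   # hypothetical block addresses


@dataclass(frozen=True)
class AgentReport:
    snapshot: Snapshot
    # None when the locations were incorporated into the snapshot itself.
    late_locations: Optional[FrozenSet[int]]


def build_report(snapshot: Snapshot,
                 late_locations: Iterable[int],
                 incorporate: bool) -> AgentReport:
    """Report post-snapshot write locations separately or merged."""
    late = frozenset(late_locations)
    if incorporate:
        merged = Snapshot(snapshot.taken_at,
                          snapshot.changed_locations | late)
        return AgentReport(snapshot=merged, late_locations=None)
    return AgentReport(snapshot=snapshot, late_locations=late)
```

Either form gives the backup server the same set of locations to back up. The separate form additionally preserves which locations were written after the snapshot time.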

Any of the elements described herein are preferably implemented in accordance with conventional techniques in computer software embodied in a non-transitory, computer-readable storage medium and/or in computer hardware.

Referring now to FIG. 5, block diagram 500 illustrates an exemplary hardware implementation of a computing system in accordance with which one or more components/methodologies of the embodiments (e.g., components/methodologies described in the context of FIGS. 1-4) may be implemented, according to an embodiment of the invention.

As shown, the techniques for coordinating point-in-time snapshots may be implemented in accordance with a processor 510, a memory 512, I/O devices 514, and a network interface 516, coupled via a computer bus 518 or alternate connection arrangement.

It is to be appreciated that the term “processor” as used herein is intended to include any processing device, such as, for example, one that includes a CPU (central processing unit) and/or other processing circuitry. It is also to be understood that the term “processor” may refer to more than one processing device and that various elements associated with a processing device may be shared by other processing devices.

The term “memory” as used herein is intended to include memory associated with a processor or CPU, such as, for example, RAM, ROM, a fixed memory device (e.g., hard drive), a removable memory device (e.g., diskette), flash memory, etc. Such memory may be considered a computer readable storage medium.

In addition, the phrase “input/output devices” or “I/O devices” as used herein is intended to include, for example, one or more input devices (e.g., keyboard, mouse, scanner, etc.) for entering data to the processing unit, and/or one or more output devices (e.g., speaker, display, printer, etc.) for presenting results associated with the processing unit.

The flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

It will be appreciated that any of the elements described hereinabove may be implemented as a computer program product embodied in a computer-readable medium, such as in the form of computer program instructions stored on magnetic or optical storage media or embedded within computer hardware, and may be executed by or otherwise accessible to a computer.

While the methods and apparatus herein may or may not have been described with reference to specific computer hardware or software, it is appreciated that the methods and apparatus described herein may be readily implemented in computer hardware or software using conventional techniques.

While the embodiments of the invention have been described, the description is intended to be illustrative and is not to be construed as limiting the invention to the embodiments shown. It is appreciated that various modifications may occur to those skilled in the art that, while not specifically shown herein, are nevertheless within the true spirit and scope of the embodiments.

What is claimed is:
 1. A method for coordinating point-in-time snapshots among multiple computing platforms, the method comprising: notifying an agent on a first computing platform of a time when a snapshot of a second computing platform's data storage was performed; and receiving from the agent on the first computing platform a report of any location in the first computing platform's data storage to which data was written after the time when the snapshot of the first computing platform's data storage was performed in response to a write request that was made prior to or including the time when the snapshot of the second computing platform's data storage was performed.
 2. The method according to claim 1, further comprising instructing the agents to perform their snapshots at or about the same designated time.
 3. The method according to claim 1, further comprising receiving from the agent on the first computing platform the snapshot of the first computing platform's data storage; and performing a backup of the first computing platform's data storage, wherein the backup is performed using: the snapshot of the first computing platform's data storage; and the report of any location in the first computing platform's data storage to which data were written after the time when the snapshot of the first computing platform's data storage was performed and responsive to a write request that was made prior to or including the time when the snapshot of the second computing platform's data storage was performed.
 4. The method according to claim 1, wherein the receiving, determining, and notifying are implemented in any of (a) computer hardware, and (b) computer software embodied in a non-transitory computer-readable storage medium.
 5. A method for coordinating point-in-time snapshots among multiple computing platforms, the method comprising: receiving a notification indicating that a snapshot synchronization time is later than a time when a snapshot of a first computing platform's data storage was performed, the snapshot synchronization time equaling a time when a snapshot of a second computing platform's data storage was performed; and reporting any location in the first computing platform's data storage to which data were written after the time when the snapshot of the first computing platform's data storage was performed and in response to a write request to the first computing platform's data storage that was made prior to or including the snapshot synchronization time.
 6. The method according to claim 5, further comprising recording in the snapshot of the first computing platform's data storage any location in the first computing platform's data storage to which data were written after the time when the snapshot of the first computing platform's data storage was performed and in response to a write request to the first computing platform's data storage that was made prior to or including the snapshot synchronization time.
 7. The method according to claim 5, wherein reporting the time when the snapshot of the first computing platform's data storage was performed comprises reporting the time when the snapshot of the first computing platform's data storage was completed.
 8. The method according to claim 5, further comprising: causing all requests to write data to the first computing platform's data storage to be intercepted by a first interceptor; causing the time of each intercepted request to be recorded; and preventing the data of each intercepted request from being written to the first computing platform's data storage, wherein the causing and preventing are performed prior to performing the snapshot of the first computing platform's data storage.
 9. The method according to claim 8, further comprising: causing all requests to write data to the first computing platform's data storage to be intercepted by a second interceptor; allowing the data of each request intercepted by the second interceptor to be written to the computing platform's data storage at write locations; causing the write locations to be recorded; and causing the first interceptor to release each intercepted request that was intercepted prior to and including the snapshot synchronization time.
 10. The method according to claim 5, wherein the performing, reporting, and receiving are implemented in any of (a) computer hardware, and (b) computer software embodied in a non-transitory, computer-readable storage medium.
 11. A system for coordinating point-in-time snapshots among multiple computing platforms, the system comprising: an agent on a first computing platform; an agent on a second computing platform; and a backup server configured to: notify the agent on the first computing platform of the time when a snapshot of the second computing platform's data storage was performed, and receive from the agent on the first computing platform a report of any location in the first computing platform's data storage to which data were written after the time when the snapshot of the first computing platform's data storage was performed and in response to a write request that was made prior to or including the time when the snapshot of the second computing platform's data storage was performed.
 12. The system according to claim 11, wherein the backup server is further configured to instruct the agents to perform their snapshots at or about the same designated time.
 13. The system according to claim 11, wherein the backup server is further configured to receive from the agent on the first computing platform the snapshot of the first computing platform's data storage; and perform a backup of the first computing platform's data storage, wherein the backup is performed using: the snapshot of the first computing platform's data storage and the report of any location in the first computing platform's data storage to which data were written after the time when the snapshot of the first computing platform's data storage was performed and responsive to a write request that was made prior to or including the time when the snapshot of the second computing platform's data storage was performed.
 14. The system according to claim 11, wherein the first agent and second agent are configured to: perform a snapshot of a computing platform's data storage, report a time when the snapshot was performed, receive a notification indicating that a snapshot synchronization time that is later than the time when the snapshot of the computing platform's data storage was performed, and report any location in the computing platform's data storage to which data were written after the time when the snapshot of the computing platform's data storage was performed and in response to a write request that was made prior to or including the snapshot synchronization time.
 15. The system according to claim 14, wherein the first agent and second agent are further configured to record in the snapshot of the computing platform's data storage any location in the computing platform's data storage to which data were written after the time when the snapshot of the computing platform's data storage was performed and in response to a write request that was made prior to or including the snapshot synchronization time.
 16. The system according to claim 14, wherein times when the snapshots were performed are the times when the snapshots were completed.
 17. The system according to claim 14, wherein the first agent and second agent are further configured to: cause all requests to write data to the computing platform's data storage to be intercepted by a file system monitor prior to performing the snapshot, cause the time of each intercepted request to be recorded by the file system monitor; and cause the file system monitor to prevent the data of each intercepted request from being written to the computing platform's data storage.
 18. The system according to claim 17, wherein the first agent and second agent are configured to: cause all requests to write data to the computing platform's data storage to be intercepted by a disk monitor, cause the disk monitor to allow the data of each request intercepted by the second interceptor to be written to the computing platform's data storage at write locations, cause the write locations to be recorded, and cause the file system monitor to release each intercepted request that was intercepted prior to and including the snapshot synchronization time.
 19. The system according to claim 11, wherein the backup server and the first agent and second agent are implemented in any of (a) computer hardware, and (b) computer software embodied in a non-transitory, computer-readable storage medium.